Dissecting Human Pre-Editing toward Better Use of Off-the-Shelf Machine Translation Systems

Authors

  • Rei Miyata
  • Atsushi Fujita
Abstract

Machine translation (MT) systems cannot always produce translations of human-level quality. As a practical means of making better use of such MT systems, we investigated the potential of a pre-editing strategy by collecting actual pre-edit instances using a human-in-the-loop protocol. In our study, targeting Japanese-to-English translation on four different datasets and using an off-the-shelf MT system, we collected a total of 12,287 pre-edit instances for 400 source sentences and obtained promising results: more than 85% of the source sentences turned out to be accurately translated by the MT system. We also found that the pre-edited Japanese source sentences were better translated into Chinese and Korean, confirming the usefulness of the pre-editing strategy in a multilingual setting. By decomposing the collected pre-edit instances, we built a typology of primitive edit operations comprising 53 types, which reveals subjects for further research.

Similar resources

Teaching MT Through Pre-editing: Three Case Studies

This article reports on three cases of teaching translation or English as a foreign language using pre-editing tasks with a machine translation system. Trainee translators or English learners were asked to input a Chinese or English paragraph into an MT system, observe the irregularities in the output, and subsequently edit the source text and input it again in the hope of getting better output...

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

Toward an MT System without Pre-Editing - Effects of New Methods in ALT-J/E -

Recently, several types of Japanese to English MT (machine translation) systems have been developed, but prior to using such systems, they have required a pre-editing process of re-writing the original text into Japanese that could be easily translated. For communication of translated information requiring speed in dissemination, application of these systems would necessarily pose problems. To ...

Machine Translation already does Work

The first difficulty in answering a question like "Does machine translation work?" is that the question itself is ill-posed. It takes for granted that there is one single thing called machine translation and that everyone is agreed about what it is. But in fact, even a cursory glance at the systems already around, either in regular operational use or under development, will reveal a wide range of...

New Approaches to Machine Translation

The current resurgence of interest in machine translation is partially attributable to the emergence of a variety of new paradigms, ranging from better translation aids and improved pre- and post-editing methods, to highly interactive approaches and fully automated knowledge-based systems. This paper discusses each basic approach and provides some comparative analysis. It is argued that both int...

Publication date: 2017